17. Solution: Diagnosing Cancer

The graph below is a histogram of the predictions our model gives on a set of images of lesions, built as follows:

  • Each point on the horizontal axis is a value p from 0 to 1.
  • Above each value p, we place all the lesions to which our classifier assigned probability p of being malignant.
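As a rough illustration of how such a histogram is put together, here is a minimal Python sketch. The scores and labels in `scored_lesions` are invented for this example and are not the course's data; the function name `histogram` and the choice of 10 bins are also assumptions.

```python
from collections import Counter

# Hypothetical data, invented for illustration only:
# (predicted probability of being malignant, true label)
scored_lesions = [
    (0.05, "benign"), (0.15, "benign"), (0.25, "benign"), (0.35, "benign"),
    (0.25, "malignant"), (0.60, "malignant"), (0.85, "malignant"), (0.90, "malignant"),
]

def histogram(scored_lesions, n_bins=10):
    """Count lesions per (bin, label); bin k covers [k/n_bins, (k+1)/n_bins)."""
    counts = Counter()
    for p, label in scored_lesions:
        # A prediction of exactly 1.0 is placed in the last bin.
        bin_index = min(int(p * n_bins), n_bins - 1)
        counts[(bin_index, label)] += 1
    return counts

counts = histogram(scored_lesions)

# Text rendering: one row per bin, "B" for each benign and "M" for each malignant lesion.
for k in range(10):
    benign = counts[(k, "benign")]
    malignant = counts[(k, "malignant")]
    print(f"{k / 10:.1f}-{(k + 1) / 10:.1f}  " + "B" * benign + "M" * malignant)
```

Each row of the printout plays the role of one column of the graph: the lesions stacked over a given range of p.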

Here we have graphed the thresholds at 0.2, 0.5, and 0.8. Notice how:

  • At 0.2, we classify every malignant lesion correctly, yet we also send a lot of benign lesions for more testing.
  • At 0.5, we miss some malignant lesions (bad), and we send a few benign lesions for more testing.
  • At 0.8, we correctly classify most of the benign lesions, but we miss many malignant lesions (very bad).
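The trade-off described in the list above can be sketched in a few lines of Python. This is only an illustration: the `predictions` list is made up, `evaluate_threshold` is a hypothetical helper, and the convention that a lesion is sent for more testing when its predicted probability is at or above the threshold is an assumption.

```python
# Hypothetical data, invented for illustration only:
# (predicted probability of being malignant, true label)
predictions = [
    (0.05, "benign"), (0.10, "benign"), (0.15, "benign"), (0.25, "benign"),
    (0.30, "benign"), (0.35, "benign"), (0.45, "benign"), (0.55, "benign"),
    (0.82, "benign"),
    (0.25, "malignant"), (0.40, "malignant"), (0.60, "malignant"),
    (0.70, "malignant"), (0.85, "malignant"), (0.90, "malignant"),
]

def evaluate_threshold(predictions, threshold):
    """Return (missed malignant lesions, benign lesions sent for more testing).

    Assumed convention: a lesion is sent for more testing when its
    predicted probability is greater than or equal to the threshold.
    """
    missed_malignant = sum(
        1 for p, label in predictions if label == "malignant" and p < threshold
    )
    benign_sent = sum(
        1 for p, label in predictions if label == "benign" and p >= threshold
    )
    return missed_malignant, benign_sent

for t in (0.2, 0.5, 0.8):
    missed, sent = evaluate_threshold(predictions, t)
    print(f"threshold {t}: missed malignant = {missed}, benign sent for testing = {sent}")
```

On this made-up data the pattern matches the one in the graph: raising the threshold reduces the number of benign lesions sent for more testing, but increases the number of malignant lesions that are missed.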

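So in this case, 0.2 is arguably the best of the three thresholds: missing a malignant lesion is much worse than sending a benign lesion for more testing.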

However, for this model, there may be an even better value for the threshold. What would it be?

Threshold quiz

What would be the perfect value for the threshold in this model?